65 research outputs found

    The hidden sexual minorities: machine learning approaches to estimate the sexual minority orientation among Beijing college students

    Based on the fourth wave of the Beijing College Students Panel Survey (BCSPS), this study aims to estimate accurately the percentage of potential sexual minorities among Beijing college students using machine learning methods. Specifically, we employ random forest (RF), an ensemble learning approach for classification and regression, to predict the sexual orientation of respondents who were not willing to disclose their sexual identity. To overcome the class imbalance arising from the very different proportions of sexual minority and majority members, we adopt repeated random sub-sampling for the training set: we partition those who expressed a heterosexual orientation into varying numbers of splits and combine each split with those who expressed a sexual minority orientation. The prediction from the 24-split random forest suggests that youths in Beijing with a sexual minority orientation amount to 5.71%, almost twice the original estimate of 3.03%. The results are robust to alternative learning methods and covariate sets. The results also suggest that random forest outperforms other learning algorithms, including AdaBoost, Naive Bayes, support vector machine (SVM), and logistic regression, in dealing with missing data, showing higher accuracy, F1 score, and area under the curve (AUC) values.
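The partition-and-combine step described in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library and assumes toy data: the function name `make_balanced_training_sets`, the row labels, and the group sizes (960 and 30) are hypothetical and are not taken from the study. The abstract does not say how the per-split models are trained or combined, so the sketch stops at building the training sets.

```python
import random


def make_balanced_training_sets(majority, minority, n_splits, seed=0):
    """Partition the majority class into n_splits disjoint random parts
    and pair each part with the full minority class.

    Illustrative helper only; not the authors' implementation.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    # Strided slicing yields n_splits disjoint parts of near-equal size.
    parts = [shuffled[i::n_splits] for i in range(n_splits)]
    return [part + list(minority) for part in parts]


# Hypothetical toy data: 960 majority rows and 30 minority rows.
majority = [("majority", i) for i in range(960)]
minority = [("minority", i) for i in range(30)]

training_sets = make_balanced_training_sets(majority, minority, n_splits=24)
```

Each of the 24 training sets holds a different slice of the majority class plus every minority row, so a classifier fitted on any one set sees a far less skewed class ratio than it would on the full data.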

    Wavelet-based density estimation for noise reduction in plasma simulations using particles

    For given computational resources, the accuracy of plasma simulations using particles is mainly held back by the noise due to limited statistical sampling in the reconstruction of the particle distribution function. A method based on wavelet analysis is proposed and tested to reduce this noise. The method, known as wavelet-based density estimation (WBDE), was previously introduced in the statistical literature to estimate probability densities given a finite number of independent measurements. Its novel application to plasma simulations can be viewed as a natural extension of the finite size particles (FSP) approach, with the advantage of estimating more accurately distribution functions that have localized sharp features. The proposed method preserves the moments of the particle distribution function to a good level of accuracy, has no constraints on the dimensionality of the system, does not require an a priori selection of a global smoothing scale, and is able to adapt locally to the smoothness of the density based on the given discrete particle data. Most importantly, the computational cost of the denoising stage is of the same order as one time step of an FSP simulation. The method is compared with a recently proposed method based on proper orthogonal decomposition, and it is tested with three particle data sets that involve different levels of collisionality and interaction with external and self-consistent fields.
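The general idea of denoising a particle-based density with wavelets can be sketched in one dimension as follows. This is a simplified illustration and not the WBDE algorithm of the paper: it assumes a Haar wavelet, a histogram on 2^levels bins, and a generic universal-style hard threshold scaled by the mean bin count. All function names and parameters (`haar_forward`, `haar_inverse`, `wavelet_density`, `threshold_scale`) are hypothetical.

```python
import math
import random


def haar_forward(values):
    """Orthonormal Haar transform of a list whose length is a power of two.
    Returns the single scaling coefficient and the detail levels, finest first."""
    approx = list(values)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx[0], details


def haar_inverse(scaling, details):
    """Invert haar_forward, rebuilding from the coarsest level to the finest."""
    approx = [scaling]
    for level in reversed(details):
        finer = []
        for a, d in zip(approx, level):
            finer.append((a + d) / math.sqrt(2))
            finer.append((a - d) / math.sqrt(2))
        approx = finer
    return approx


def wavelet_density(samples, lo, hi, levels=6, threshold_scale=1.0):
    """Histogram the samples, hard-threshold the Haar detail coefficients,
    and return (bin_centers, density). The threshold rule is an assumption."""
    n_bins = 2 ** levels
    width = (hi - lo) / n_bins
    counts = [0.0] * n_bins
    for x in samples:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1.0
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    scaling, details = haar_forward(counts)
    # Universal-style threshold, treating bin noise as roughly Poisson.
    threshold = threshold_scale * math.sqrt(2.0 * math.log(n_bins) * total / n_bins)
    kept = [[d if abs(d) > threshold else 0.0 for d in level] for level in details]
    smoothed = haar_inverse(scaling, kept)
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    density = [c / (total * width) for c in smoothed]
    return centers, density


# Hypothetical test data: 2000 particles drawn from a unit Gaussian.
rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(2000)]
centers, density = wavelet_density(particles, lo=-6.0, hi=6.0, levels=6)
```

Because the scaling coefficient is never thresholded, the estimate keeps the total particle count (the zeroth moment) exactly. Hard thresholding with Haar wavelets can produce small negative density values in some bins, which this sketch does not correct.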

    Social prediction: a new research paradigm based on machine learning

    Sociology is a science concerned with both the interpretive understanding of social action and the corresponding causal explanation of its process and result. A causal explanation should be the foundation of prediction. For many years, due to constraints on data and computing power, quantitative research in social science has focused primarily on statistical tests to analyze correlations and causality, leaving prediction largely ignored. By tracing the historical context of "social prediction," this article redefines the concept and explains why and how machine learning can support prediction in a scientific way. Furthermore, this article summarizes the academic value and governance value of social prediction and suggests that it is a potential breakthrough in the contemporary social research paradigm. We believe that machine learning can usher in a paradigm shift from correlation and causality to social prediction. This shift will provide a rare opportunity for sociology in China to reach the international frontier of computational social science and accelerate the construction of philosophy and social science with Chinese characteristics.